correlation score
Vision Mamba Mender
In contrast to these approaches, this paper proposes the Vision Mamba Mender, a systematic approach for understanding the workings of Mamba, identifying flaws within, and subsequently optimizing model performance. Specifically, we present methods for predictive correlation analysis of Mamba's hidden states from both internal and external perspectives,
- Asia > China > Zhejiang Province > Ningbo (0.04)
- Asia > China > Zhejiang Province > Hangzhou (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
e3251075554389fe91d17a794861d47b-Supplemental.pdf
Now, we describe the latency measurement pipeline for desktop GPUs, Jetson, server CPUs, and mobile phones. Furthermore, even with the same GPU device, the correlation scores are not high if the batch sizes are different. Figure A.1: Visualization of 10 reference neural architectures we used for the NAS-Bench-201 search space. We randomly selected 10 reference architectures for each search space (NAS-Bench-201, FBNet, and MobileNetV3) and used them across all experiments and devices of the same search space. In Figure A.1, we visualize the 10 reference architectures that we used in the NAS-Bench-201 search space.
Beyond Top Activations: Efficient and Reliable Crowdsourced Evaluation of Automated Interpretability
Oikarinen, Tuomas, Yan, Ge, Kulkarni, Akshay, Weng, Tsui-Wei
Interpreting individual neurons or directions in activation space is an important topic in mechanistic interpretability. Numerous automated interpretability methods have been proposed to generate such explanations, but it remains unclear how reliable these explanations are, and which methods produce the most accurate descriptions. While crowd-sourced evaluations are commonly used, existing pipelines are noisy, costly, and typically assess only the highest-activating inputs, leading to unreliable results. In this paper, we introduce two techniques to enable cost-effective and accurate crowdsourced evaluation of automated interpretability methods beyond top activating inputs. First, we propose Model-Guided Importance Sampling (MG-IS) to select the most informative inputs to show human raters. In our experiments, we show this reduces the number of inputs needed to reach the same evaluation accuracy by ~13x. Second, we address label noise in crowd-sourced ratings through Bayesian Rating Aggregation (BRAgg), which allows us to reduce the number of ratings per input required to overcome noise by ~3x. Together, these techniques reduce the evaluation cost by ~40x, making large-scale evaluation feasible. Finally, we use our methods to conduct a large scale crowd-sourced study comparing recent automated interpretability methods for vision networks.
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.88)
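The Bayesian Rating Aggregation idea above can be illustrated with a minimal sketch. The abstract does not give BRAgg's actual model, so the following assumes a simple beta-binomial treatment of noisy binary ratings; the function name and prior are hypothetical.

```python
def beta_posterior_mean(ratings, prior_a=1.0, prior_b=1.0):
    """Aggregate noisy binary ratings (1 = 'explanation matches input')
    into a posterior match probability under a Beta(prior_a, prior_b) prior."""
    positives = sum(ratings)
    return (prior_a + positives) / (prior_a + prior_b + len(ratings))

# With few ratings, the prior pulls the estimate toward 0.5, damping
# the influence of any single noisy rater.
print(beta_posterior_mean([1, 1, 1]))     # 0.8 rather than a raw 1.0
print(beta_posterior_mean([1, 0, 1, 1]))  # ~0.667 rather than a raw 0.75
```

Under this kind of aggregation, a handful of ratings per input already yields a stable estimate, which is how fewer ratings can suffice to overcome label noise.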
FR-TTS: Test-Time Scaling for NTP-based Image Generation with Effective Filling-based Reward Signal
Xu, Hang, Huang, Linjiang, Zhao, Feng
Test-time scaling (TTS) has become a prevalent technique in image generation, significantly boosting output quality by expanding the number of parallel samples and filtering them using pre-trained reward models. However, applying this powerful methodology to the next-token prediction (NTP) paradigm remains challenging. The primary obstacle is the low correlation between the reward of an image decoded from an intermediate token sequence and the reward of the fully generated image. Consequently, these incomplete intermediate representations prove to be poor indicators for guiding the pruning direction, a limitation that stems from their inherent incompleteness in scale or semantic content. To effectively address this critical issue, we introduce the Filling-Based Reward (FR). This novel design estimates the approximate future trajectory of an intermediate sample by finding and applying a reasonable filling scheme to complete the sequence. Both the correlation coefficient between rewards of intermediate samples and final samples, as well as multiple intrinsic signals like token confidence, indicate that the FR provides an excellent and reliable metric for accurately evaluating the quality of intermediate samples. Building upon this foundation, we propose FR-TTS, a sophisticated scaling strategy. FR-TTS efficiently searches for good filling schemes and incorporates a diversity reward with a dynamic weighting schedule to achieve a balanced and comprehensive evaluation of intermediate samples. We experimentally validate the superiority of FR-TTS over multiple established benchmarks and various reward models. Code is available at \href{https://github.com/xuhang07/FR-TTS}{https://github.com/xuhang07/FR-TTS}.
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.62)
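The core diagnostic in the FR-TTS abstract, the correlation between intermediate-sample rewards and final-sample rewards, can be sketched with stdlib Python. The reward values below are made up purely for illustration; only the shape of the comparison reflects the abstract.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Illustrative (made-up) rewards for four parallel samples:
final_rewards  = [0.9, 0.4, 0.7, 0.2]     # rewards of fully generated images
raw_partial    = [0.5, 0.6, 0.4, 0.5]     # images decoded mid-sequence
filled_partial = [0.85, 0.45, 0.65, 0.25] # after applying a filling scheme

# A filling-based estimate should track the final reward far better,
# which is what makes it usable for pruning during test-time scaling.
assert pearson(filled_partial, final_rewards) > pearson(raw_partial, final_rewards)
```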
ECGXtract: Deep Learning-based ECG Feature Extraction for Automated CVD Diagnosis
Abuzied, Youssif, AbdEltawab, Hassan, Gaber, Abdelrhman, ElBatt, Tamer
This paper presents ECGXtract, a deep learning-based approach for interpretable ECG feature extraction, addressing the limitations of traditional signal processing and black-box machine learning methods. In particular, we develop convolutional neural network models capable of extracting both temporal and morphological features with strong correlations to a clinically validated ground truth. Initially, each model is trained to extract a single feature, ensuring precise and interpretable outputs. A series of experiments is then carried out to evaluate the proposed method across multiple setups, including global versus lead-specific features, different sampling frequencies, and comparisons with other approaches such as ECGdeli. Our findings show that ECGXtract achieves robust performance across most features, with a mean correlation score of 0.80 with the ground truth for global features, and lead II consistently providing the best results. For lead-specific features, ECGXtract achieves a mean correlation score of 0.822. Moreover, ECGXtract outperforms the state-of-the-art open-source ECGdeli, achieving a higher correlation with the ground truth on 90% of the features. Furthermore, we explore the feasibility of extracting multiple features simultaneously using a single model. Semantic grouping proves effective for global features, while large-scale grouping and lead-specific multi-output models show notable performance drops. These results highlight the potential of structured grouping strategies to balance computational efficiency against model accuracy, paving the way for more scalable and clinically interpretable ECG feature extraction systems in resource-limited settings.
- Asia > India (0.04)
- Africa > Middle East > Egypt > Cairo Governorate > Cairo (0.04)
- Europe > Montenegro (0.04)
- (2 more...)
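The evaluation the ECGXtract abstract describes, per-feature correlation against a clinically validated ground truth, summarized by a mean score, can be sketched as follows. The feature names and values are hypothetical; this is not the paper's code.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def mean_feature_correlation(predicted, ground_truth):
    """predicted / ground_truth: dict mapping feature name -> per-record values.
    Returns per-feature correlations and their mean (the summary score)."""
    scores = {f: pearson(predicted[f], ground_truth[f]) for f in ground_truth}
    return scores, sum(scores.values()) / len(scores)

# Hypothetical values (ms) for two features over five ECG records:
truth = {"qrs_duration": [80, 95, 100, 110, 90],
         "pr_interval": [160, 150, 170, 180, 155]}
pred  = {"qrs_duration": [82, 93, 101, 108, 91],
         "pr_interval": [158, 152, 169, 178, 157]}
scores, mean_score = mean_feature_correlation(pred, truth)
assert mean_score > 0.9  # near-perfect predictions correlate strongly
```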
Safe-SAIL: Towards a Fine-grained Safety Landscape of Large Language Models via Sparse Autoencoder Interpretation Framework
Weng, Jiaqi, Zheng, Han, Zhang, Hanyu, He, Qinqin, Tao, Jialing, Xue, Hui, Chu, Zhixuan, Wang, Xiting
Increasing deployment of large language models (LLMs) in real-world applications raises significant safety concerns. Most existing safety research focuses on evaluating LLM outputs or specific safety tasks, limiting its ability to address broader, undefined risks. Sparse Autoencoders (SAEs) facilitate interpretability research that clarifies model behavior by explaining single-meaning atomic features decomposed from entangled signals. However, prior applications of SAEs do not interpret features with fine-grained safety-related concepts, and thus inadequately address safety-critical behaviors such as generating toxic responses and violating safety regulations. For rigorous safety analysis, we must extract a rich and diverse set of safety-relevant features that effectively capture these high-risk behaviors, yet we face two challenges: identifying the SAEs with the greatest potential for generating safety concept-specific neurons, and the prohibitively high cost of detailed feature explanation. In this paper, we propose Safe-SAIL, a framework for interpreting SAE features within LLMs to advance mechanistic understanding in safety domains. Our approach systematically identifies the SAE with the best concept-specific interpretability, explains safety-related neurons, and introduces efficient strategies to scale up the interpretation process. We will release a comprehensive toolkit including SAE checkpoints and human-readable neuron explanations, which supports empirical analysis of safety risks to promote research on LLM safety.
- North America > United States (0.04)
- Asia > China > Zhejiang Province > Hangzhou (0.04)
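The SAE mechanism the Safe-SAIL abstract relies on, decomposing a dense activation into sparse, single-meaning features, can be shown in a minimal forward-pass sketch. The weights, bias, and dimensions below are toy values, not anything from the paper.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def sae_features(x, W_enc, b_enc):
    """Encode a dense activation vector into sparse, non-negative feature
    activations; a learned negative bias keeps most features at zero."""
    return relu([h + b for h, b in zip(matvec(W_enc, x), b_enc)])

# Toy 2-d activation, 3 candidate features:
W_enc = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b_enc = [-0.5, -0.5, -0.5]
feats = sae_features([1.0, 0.2], W_enc, b_enc)
assert feats[1] == 0.0  # only 2 of the 3 features fire on this input
```

Each nonzero entry is a candidate atomic feature; interpreting which inputs activate it is the expensive explanation step the framework aims to scale.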
ALOPE: Adaptive Layer Optimization for Translation Quality Estimation using Large Language Models
Sindhujan, Archchana, Qian, Shenbin, Matthew, Chan Chi Chun, Orasan, Constantin, Kanojia, Diptesh
Large Language Models (LLMs) have shown remarkable performance across a wide range of natural language processing tasks. Quality Estimation (QE) for Machine Translation (MT), which assesses the quality of a source-target pair without relying on reference translations, remains a challenging cross-lingual task for LLMs. The challenges stem from the inherent limitations of existing LLM-based QE systems, which are pre-trained for causal language modelling rather than regression-specific tasks, and are compounded for low-resource languages that are underrepresented in the pre-training data distribution. This paper introduces ALOPE, an adaptive layer-optimization framework designed to enhance LLM-based QE by restructuring Transformer representations through layer-wise adaptation for improved regression-based prediction. Our framework integrates low-rank adapters (LoRA) with regression task heads, leveraging selected pre-trained Transformer layers for improved cross-lingual alignment. In addition to the layer-specific adaptation, ALOPE introduces two strategies: dynamic weighting, which adaptively combines representations from multiple layers, and multi-head regression, which aggregates regression losses from multiple heads for QE. Our framework shows improvements over various existing LLM-based QE approaches. Empirical evidence suggests that intermediate Transformer layers in LLMs provide contextual representations that are more aligned with the cross-lingual nature of the QE task. We make the resulting models and framework code publicly available for further research, enabling existing LLM-based MT frameworks to be extended with QE capabilities.
- Europe > Austria > Vienna (0.14)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- (10 more...)
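ALOPE's dynamic weighting strategy, adaptively combining representations from multiple layers, can be sketched as a softmax-weighted sum over per-layer vectors. This is a generic sketch of that idea, not the paper's implementation; in practice the logits would be learned parameters.

```python
from math import exp

def softmax(logits):
    m = max(logits)
    exps = [exp(l - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def combine_layers(layer_reps, layer_logits):
    """Softmax-weighted combination of per-layer representations
    (each a list of floats of equal length)."""
    alphas = softmax(layer_logits)
    dim = len(layer_reps[0])
    return [sum(a * rep[i] for a, rep in zip(alphas, layer_reps))
            for i in range(dim)]

# Two layers with equal logits contribute equally:
combined = combine_layers([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0])
assert combined == [2.0, 3.0]
```

The combined vector would then feed a small regression head that predicts the QE score, letting training shift weight toward the intermediate layers the abstract finds most useful.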
Dataset Cartography for Large Language Model Alignment: Mapping and Diagnosing Preference Data
Lee, Seohyeong, Kim, Eunwon, Lee, Hwaran, Chang, Buru
Human preference data plays a critical role in aligning large language models (LLMs) with human values. However, collecting such data is often expensive and inefficient, posing a significant scalability challenge. To address this, we introduce Alignment Data Map, a GPT-4o-assisted tool for analyzing and diagnosing preference data. Using GPT-4o as a proxy for LLM alignment, we compute alignment scores for LLM-generated responses to instructions from existing preference datasets. These scores are then used to construct an Alignment Data Map based on their mean and variance. Our experiments show that using only 33 percent of the data, specifically samples in the high-mean, low-variance region, achieves performance comparable to or better than using the entire dataset. This finding suggests that the Alignment Data Map can significantly improve data collection efficiency by identifying high-quality samples for LLM alignment without requiring explicit annotations. Moreover, the Alignment Data Map can diagnose existing preference datasets. Our analysis shows that it effectively detects low-impact or potentially misannotated samples. Source code is available online.
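The selection rule in the Alignment Data Map abstract, keep samples in the high-mean, low-variance region of per-sample alignment scores, can be sketched directly. The thresholds and sample data here are made up for illustration.

```python
from statistics import mean, pvariance

def select_high_mean_low_variance(score_sets, mean_thresh=0.8, var_thresh=0.01):
    """score_sets: sample id -> alignment scores of its candidate responses.
    Keep samples in the high-mean, low-variance region of the data map."""
    return [sid for sid, s in score_sets.items()
            if mean(s) >= mean_thresh and pvariance(s) <= var_thresh]

scores = {
    "a": [0.90, 0.92, 0.91],  # consistently well-aligned -> keep
    "b": [0.90, 0.10, 0.50],  # high variance: ambiguous or misannotated
    "c": [0.20, 0.25, 0.22],  # consistently low alignment
}
assert select_high_mean_low_variance(scores) == ["a"]
```

The high-variance bucket ("b" above) is exactly where the abstract says the map is useful for diagnosis: those samples are candidates for re-annotation rather than training.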
Vectors from Larger Language Models Predict Human Reading Time and fMRI Data More Poorly when Dimensionality Expansion is Controlled
Lin, Yi-Chien, Zhu, Hongao, Schuler, William
The impressive linguistic abilities of large language models (LLMs) have recommended them as models of human sentence processing, with some conjecturing a positive 'quality-power' relationship (Wilcox et al., 2023), in which language models' (LMs') fit to psychometric data continues to improve as their ability to predict words in context increases. This is important because it suggests that elements of LLM architecture, such as veridical attention to context and a unique objective of predicting upcoming words, reflect the architecture of the human sentence processing faculty, and that any inadequacies in predicting human reading time and brain imaging data may be attributed to insufficient model complexity, which recedes as larger models become available. Recent studies (Oh and Schuler, 2023) have shown this scaling inverts after a point, as LMs become excessively large and accurate, when word prediction probability (as information-theoretic surprisal) is used as a predictor. Other studies propose the use of entire vectors from differently sized LLMs, still showing positive scaling (Schrimpf et al., 2021), casting doubt on the value of surprisal as a predictor, but do not control for the larger number of predictors in vectors from larger LMs. This study evaluates LLM scaling using entire LLM vectors, while controlling for the larger number of predictors in vectors from larger LLMs. Results show that inverse scaling obtains, suggesting that inadequacies in predicting human reading time and brain imaging data may be due to substantial misalignment between LLMs and human sentence processing, which worsens as larger models are used.
- North America > United States > Ohio (0.04)
- Europe > Middle East > Malta > Eastern Region > Northern Harbour District > St. Julian's (0.04)
- Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
- (2 more...)
- Health & Medicine > Health Care Technology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (0.69)
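The methodological point in the reading-time/fMRI abstract, that raw regression fit is not comparable when larger LMs contribute more predictors, can be illustrated with adjusted R^2, a standard penalty for predictor count. The abstract does not say this is the paper's actual control; it is shown only to make the confound concrete.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Penalize R^2 for the number of regressors, so fits from vectors of
    different dimensionality (e.g. different LLM sizes) are comparable."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

# The same raw R^2 is worth less when it took five times as many predictors:
small_lm = adjusted_r2(0.50, n_obs=100, n_predictors=10)
large_lm = adjusted_r2(0.50, n_obs=100, n_predictors=50)
assert large_lm < small_lm
```

Without some such control, a larger LM's vector can appear to fit psychometric data better simply because a regression with more free predictors always fits at least as well in-sample.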